Implicit SIMD, Parallel Kernels, Vectorization, Gang Programming

The Story on ISPC (Intel SPMD Program Compiler)
pharr.org·1d·
Discuss: Hacker News
🔀SIMD Programming
CPU-less parallel execution of lambda calculus in digital logic
arxiv.org·15h
🔓Lock-Free Programming
Pushing the Packed SIMD Extension Over the Line: An Update on the Progress of Key RISC-V Extension
semiwiki.com·1d
📏Picolibc
istmarc/tenseur: C++23 Tensor, neural networks and mathematical library
github.com·2h·
Discuss: r/cpp
⚙️XLA
Hardware-Aware Reformulation of Convolutions for Efficient Execution on Specialized AI Hardware: A Case Study on NVIDIA Tensor Cores
arxiv.org·15h
🔬Deep Learning
I Made Zig Compute 33 Million Satellite Positions in 3 Seconds. No GPU Required.
atempleton.dev·1d·
Discuss: Hacker News
🛣️Highway
Computer-on-Modules for an efficient entry into rugged embedded edge AI applications
einpresswire.com·1d
🔌Embedded Systems
Generative Thermodynamic Computing
link.aps.org·9h
🔲Cellular Automata
FlashAttention 4: Faster, Memory-Efficient Attention for LLMs
digitalocean.com·8h
🔄Hardware Transactional Memory
Scientific Computing in Rust Monthly #14
scientificcomputing.rs·8h
🦀Rust Macros
Why AI Needs GPUs and TPUs: The Hardware Behind LLMs
blog.bytebytego.com·2d
Hardware Acceleration
ANN v3: 200ms p99 query latency over 100 billion vectors
turbopuffer.com·20h·
Discuss: Hacker News
🌊Memory Bandwidth
**Abstract:** This research proposes a novel approach to dynamic resource allocation within CUDA Streaming Multiprocessors (SMs) to enhance performance and e...
freederia.com·1d
🧩mimalloc
Building a mini PyTorch in C++ from scratch as a high school student...
dev.to·7h·
Discuss: DEV
🧮Vector Databases
SHADOW: Simultaneous Multi-Threading Architecture with Asymmetric Threads
danglingpointers.substack.com·1d·
Discuss: Substack
🧵Lightweight Threads
Setting Up A Cluster of Tiny PCs For Parallel Computing - A Note To Myself
kenkoonwong.com·1h·
Discuss: Hacker News
📜MultiPaxos
Binary Algorithms
exystence.net·20h
⏭️Skip Lists
Everyone deserves a better computer | Ahead Computing
aheadcomputing.com·4h·
Discuss: Hacker News
🏗Computer Architecture
Addressing Critical Tradeoffs In NPU Design
semiengineering.com·12h
🏗️System Design
CUDA Programming: From Zero to GPU Kernels
pythongiant.github.io·9h·
Discuss: Hacker News
🎮SIMT Execution